Abstract

Phylogenetic invariants are not the only constraints on site-pattern 
frequency vectors for phylogenetic trees. A mutation matrix, by its defi- 
nition, is the exponential of a matrix with non-negative off-diagonal en- 
tries; this positivity requirement implies non-trivial constraints on the 
site-pattern frequency vectors. We call these additional constraints "edge- 
parameter inequalities." In this paper, we first motivate the edge-parameter 
inequalities by considering a pathological site-pattern frequency vector 
corresponding to a quartet tree with a negative internal edge. This site- 
pattern frequency vector nevertheless satisfies all of the constraints de- 
scribed up to now in the literature. We next describe two complete sets of 
edge-parameter inequalities for the group-based models; these constraints 
are square-free monomial inequalities in the Fourier transformed coordi- 
nates. These inequalities, along with the phylogenetic invariants, form 
a complete description of the set of site-pattern frequency vectors corre- 
sponding to bona fide trees. Said in mathematical language, this paper 
explicitly presents two finite lists of inequalities in Fourier coordinates of 
the form "monomial < 1," each list characterizing the phylogenetically 
relevant semialgebraic subsets of the phylogenetic varieties. 



1 Introduction 

The Bayesian and maximum-likelihood methods in phylogenetics can be classified
as "model based." That is, at some stage in the analysis, one assumes a
mutation model and calculates the likelihood of the observed data for a given 
tree and set of model parameters. We will call the set of site-pattern frequency 
vectors generated on a fixed tree by a mutation model under legal parameter 
settings a "tree image." One of the main goals of the emerging field of
phylogenetic geometry [1-5] is to locate these tree images in site-pattern frequency
space. Such work is foundational to understanding when model-based phyloge- 
netics does and does not succeed. 

The mutation models for sequences evolving on a tree are typically given 
in terms of nucleotide mutation models, which are stochastic matrices giving 
the probability of various mutations at an arbitrary site. One such matrix is 
associated with each edge; consequently one multiplies matrices along paths in 
the tree to get the mutation matrix along that path. Because a series of matrix 
multiplications is polynomial in the entries of the matrices, one can consider the 
tree image as a subset of an affine variety. 

It is then natural to apply the well-developed tools of algebraic geometry to 
analyze these varieties. In particular, there has been a flourishing of interest 
in the corresponding ideals of these varieties; in the present setting these are 
called "phylogenetic invariants" [1,3,5-7]. Although not completely understood 
for all models, a considerable amount of beautiful work has been done on these 
invariants; a very nice overview has been published in [8]. 

One can then formulate a constrained optimization problem by optimiz- 
ing the likelihood function across the set of site-pattern frequency vectors con- 
strained to satisfy the phylogenetic invariants. This is the view taken by [9] 
(equation (3)) where it is called the maximum likelihood problem. Another 
article [5] says "exact computation of maximum likelihood estimates. . . can be 
formulated. . . as a constrained optimization problem where the probabilities are 
the decision variables and the phylogenetic invariants are the constraints." A 
similar statement has been made in a review article concerning the use of phy- 
logenetic invariants for tree reconstruction [10]. 

These statements may be confusing for computational biologists thinking 
of phylogenetic trees as descriptions of mutational processes occurring in the 
evolutionary past. Indeed, there are solutions to the phylogenetic invariants 
sitting in the probability simplex which do not correspond to any reasonable 
assignment of branch lengths (or, more generally, edge parameters) to a tree. 
In the language of algebraic geometry, the tree image is not equal to its Zariski 
closure intersected with the probability simplex. This observation is not original 
to this paper: the authors of [2] define a useful notion of "biologically
meaningful" solutions to the phylogenetic invariants. Their criterion is
satisfied if the Fourier transforms of the mutation matrices have non-negative
diagonal entries. Positivity of Fourier transforms is indeed a necessary
condition for a mutation matrix to come from a model (see Observation 2.3), but
is not sufficient, as we demonstrate below in our motivating example.

Our simple observation is this: mutation matrices are the result of a
continuous-time Markov process operating for some non-negative period of time.
This fact is implicit in any description of mutation as a process in terms of
rates, for example in the original description of the Kimura models [11]. In the
notation of Markov processes,

$$P^{(e)} = \exp(t_e Q^{(e)}) \qquad (1)$$

where $P^{(e)}$ is the mutation matrix for an edge $e$, $t_e \ge 0$ is elapsed
time, and $Q^{(e)}$ is the mutation rate matrix. In this setting $Q^{(e)}$ must
be a "Q-matrix", i.e. have non-negative off-diagonal entries and zero row sums
[12].

The observation (1) implies a collection of nontrivial square-free monomial
inequalities in the Fourier transformed probability space which ensure that a 
solution to a complete set of phylogenetic invariants indeed corresponds to a 
bona fide tree. This paper develops a complete set of such inequalities; we call 
them "edge-parameter inequalities." 

First we present a very simple motivating example on the quartet tree to 
illustrate the need for edge-parameter inequalities. This example has a negative 
internal branch length, or, said another way, the mutation rate matrix along 
that edge contains negative off-diagonal entries. Despite this nonsensical setup, 
the associated site-pattern frequency vector satisfies the phylogenetic invariants 
and sits in the probability simplex. Furthermore, the parameters satisfy the 
useful "biologically meaningful" criterion of [2] , which as noted is necessary but 
not sufficient for a tree to have positive edge parameters. For our example we 
assume the two-state symmetric (CFN) model with uniform distribution at the
root, labeling the two states 0 and 1. In the CFN model, there is only a single
parameter per edge, called the branch length. It is the amount of time for which
we allow our binary Poisson mutation process to run; thus the probability that
the endpoints of an edge are in different states is $0.5(1 - \exp(-2\gamma))$
for an edge of length $\gamma$. Let $\theta = \exp(-2\gamma)$; the Fourier
transform [13] of the mutation matrix of an edge of length $\gamma$ is thus
$\mathrm{diag}(1, \theta)$.

Our motivating example is as follows: consider the tree on taxa 1, 2, 3 and 4
with the 12|34 split. Give each pendant edge length $\gamma$ and the internal
edge length $-\gamma$. Thus formally, by the above, the off-diagonal entries of
the mutation rate matrix for the internal edge will be negative. We now show
that if $\gamma > 0.60938$ then the expected site-pattern frequency vector for
this tree will satisfy all of the restrictions described up to now in the
literature.

With the above notation, the nontrivial entry of the Fourier transform of the
mutation matrix will be $\theta$ for the pendant edges and $\theta^{-1}$ for the
internal edge. In this and the following sections, we use $p$ to denote points
of the probability simplex and $q$ to denote points of the Fourier transform of
the probability simplex. We will call the $p$ "site-pattern frequency vectors"
and the image of the probability simplex under the Fourier transform "q-space."
We will index $p$ and $q$ with taxon state vectors $g$.

We use Hadamard conjugation to compute $q$ for the pathological tree. The
formulation for general group-based models is given in (4), but for the CFN
model the calculation of $q$ is quite simple. To find a given $q_g$, first let
$S_g$ be the set of all taxa in state 1 according to $g$. Second, let $E_g$ be
the set of edges in the (unique) collection of disjoint paths connecting the
taxa in $S_g$ to each other. Then $q_g$ is simply the product of all nontrivial
entries of the Fourier transform of the mutation matrices for edges in $E_g$
[13]. For example, the path collection corresponding to $g = 1010$ is the single
path connecting taxa 1 and 3, going through the internal edge. Thus
$q_{1010} = \theta \cdot \theta^{-1} \cdot \theta = \theta$. All of the other
similar calculations are reported in Table 1. An application of the inverse
g       q_g          8 p_g
0000    $1$          $1 + 4\theta + 2\theta^2 + \theta^4$
1001    $\theta$     $1 - \theta^4$
0101    $\theta$     $1 - \theta^4$
1100    $\theta^2$   $1 - 4\theta + 2\theta^2 + \theta^4$
0011    $\theta^2$
1010    $\theta$
0110    $\theta$
1111    $\theta^4$

Table 1: Site pattern frequencies and their Fourier transforms for the example
mentioned in the text.

Fourier transform gives the $p$. Note that because our root distribution is
taken to be uniform, the Fourier transform of the root distribution is nonzero
only at the identity. Thus the only nonzero $q_g$ are those for which the
$\mathbb{Z}_2$ sum of the components of $g$ equals zero.

It is clear that in Table 1 all $p_g$ are positive for $0 < \theta < 1$ with the
possible exception of $p_{1100}$. One can ensure positivity of $p_{1100}$ by
choosing $0 < \theta < 0.2955$, corresponding to a branch length
$\gamma > 0.60938$. We fix such a choice of $\theta$, which ensures that $p$
sits in the probability simplex. (Note that a less stringent constraint on the
branch lengths could be achieved by taking the absolute value of the internal
branch length to be smaller than the pendant branch lengths.) Because our $q$
comes from Hadamard conjugation, it satisfies the two phylogenetic invariants in
this setting: $q_{1001} \cdot q_{0110} = q_{1010} \cdot q_{0101}$ and
$q_{0000} \cdot q_{1111} = q_{1100} \cdot q_{0011}$. Furthermore, the diagonal
entries of the Fourier transform of the mutation matrices (i.e. 1, $\theta$, and
$\theta^{-1}$) are positive for any real $\gamma$, and thus the mutation
parameters satisfy the two-state analog of the "biologically meaningful"
criterion of [2]. However, this $q$ came from a phylogenetic tree with a
negative internal edge. Thus the example raises the question of what conditions
should be put on site-pattern frequency vectors or their Fourier transforms so
that one can be assured that the corresponding trees are well-formed.

This paper describes the set of "edge-parameter inequalities" and shows that
they are the exact conditions needed, namely that any solution of the
phylogenetic invariants for a given tree which satisfies these inequalities is
guaranteed to come from a tree with non-negative edge parameters. For example,
an edge-parameter inequality for the internal edge of the quartet tree is

$$q_{0000} \, q_{1111} \, q_{1100} \, q_{0011} > q_{1010} \, q_{0101} \, q_{1001} \, q_{0110} \qquad (2)$$

which is equivalent to the inequality $1 > \theta$ or $\gamma > 0$ for the
internal edge. Thus (2) specifically rules out the pathological example above.

We will describe two distinct versions of the edge-parameter inequalities.
The first version is derived by considering paths in the tree and thus will be
called the "path" edge-parameter inequalities. This version is relatively simple
to write down, involving two monomials of degree at most four for the two-state
models and two monomials of degree at most six for the four-state models. We
note that as this set of inequalities is derived on trees, they are only
meaningful for $q$ which satisfy a complete set of phylogenetic invariants for a
tree.

Next we present the second version of the inequalities; these inequalities
derive directly from the Székely-Steel-Erdős Fourier conjugation equation [14].
Because they are given directly by Fourier conjugation, we call these
inequalities the "canonical" edge-parameter inequalities. These inequalities for
group $G$-based models on trees of $m$ taxa carve out a subset of q-space which
we denote $Y_{G,m}$. The set of $q$'s corresponding to a given $m$-taxon tree is
the set of solutions to that tree's phylogenetic invariants intersected with
$Y_{G,m}$.

We then investigate some properties of $Y_{G,m}$. The set $Y_{G,m}$ is the
subset of q-space which corresponds precisely to the $q$ of splits networks with
non-negative split parameters using an extension of the model of [15]; thus it
is contractible. It is not convex. Furthermore, the $q$ corresponding to
phylogenetic trees sit on the boundary of $Y_{G,m}$; thus the complete space of
phylogenetic "oranges" [4, 16] for group-based models lives on this boundary.

Before getting into details, we would like to note that the idea of constraint 
inequalities goes back to the remarkable paper of Cavender and Felsenstein [3].
Indeed, they anticipate such inequalities, the (phylogenetic) Fourier transform, 
and problems with phylogenetic mixtures. Our paper can be seen as a com- 
pletion of their investigation of phylogenetic inequalities for the group-based 
models. 

2 Technical introduction 

In this section we fix notation and state two versions of Fourier conjugation. The 
application of discrete Fourier transform ideas to phylogenetics was pioneered 
in [17, 18] for the CFN model, then generalized to group-based models in [14] 
and [19]. Our notation combines that of [5] and [15]. We note that because 
Fourier conjugation is our primary tool, we will only be considering group-based 
mutation models (defined below), in particular $G = \mathbb{Z}_2$ or $\mathbb{Z}_2 \times \mathbb{Z}_2$.

As stated in the introduction, the simple observation of this paper is that
mutation transition matrices come from continuous-time Markov processes. Thus
the mutation matrices must satisfy (1) for each edge $e$, with $t_e$ and the
off-diagonal elements of $Q^{(e)}$ being non-negative. We allow the rate
matrices $Q^{(e)}$ to vary from edge to edge; thus we can (and do) incorporate
$t_e$ into $Q^{(e)}$ and so assume $t_e = 1$ for any $e$. We call the resulting
entries of the mutation rate matrices for an edge "edge parameters." We note
that in phylogenetic practice one often assumes a fixed rate matrix $Q$ for the
whole tree and the only parameters of a given edge are the branch lengths $t_e$;
here we make no such restriction.

Fourier conjugation applies to the "group-based models." Each state in such
a model is uniquely labeled with an element of an Abelian group. We will write
our group $G$ additively, with 0 denoting the identity element. The essential
point in the definition of a group-based model is that the rate of transition
from state $g$ to $h$ is only a function of the difference of $g$ and $h$ in
$G$. Fixing an edge $e$, we write

$$Q^{(e)}_{g,h} = \psi^{(e)}(h - g)$$

where $Q^{(e)}$ denotes the mutation rate matrix along an edge $e$ and
$\psi^{(e)}$ is an arbitrary $|G|$-vector with components summing to zero such
that $\psi^{(e)}(g) \ge 0$ for $g \ne 0$. The group-based models considered in
the literature are also time reversible, i.e. one requires that
$Q^{(e)}_{g,h} = Q^{(e)}_{h,g}$, which is equivalent to
$\psi^{(e)}(g) = \psi^{(e)}(-g)$. Because exponentiation preserves symmetries of
the matrices, we will also have

$$P^{(e)}_{g,h} = f^{(e)}(h - g)$$

for some probability $|G|$-vector $f^{(e)}$. Time reversibility similarly
implies $f^{(e)}(g) = f^{(e)}(-g)$.

The discrete Fourier transform is constructed via the "dual group" of an
Abelian group. The elements of $\hat G$, the dual group to $G$, are the
homomorphisms of $G$ to the multiplicative group of complex numbers of magnitude
one. The groups $G$ and $\hat G$ are isomorphic; such an isomorphism is
canonical after choosing an identification of $G$ with a direct product of
finite cyclic groups. We make such a choice, and because of the resulting
isomorphism we will use the same letters $g, h, \ldots$ to denote elements of
$G$ and $\hat G$. However, we will follow [15] in using "hat" for the
application of an element of the dual group, such that $\hat g(h)$ is the
application of $g \in \hat G$ to $h \in G$. (This conflicts with traditional
notation for Fourier transform; we will use "check" for this purpose as defined
below.) We also note that because $G$ is isomorphic to a direct product of
cyclic groups we have $\hat g(h) = \hat h(g)$.

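The Fourier transform of a function $a : G \to \mathbb{C}$ is

$$\check a(g) := \sum_{h \in G} \hat g(h) \, a(h).$$

By the definitions, $\check f^{(e)}(0) = 1$ for any $e$. Note

$$\check f^{(e)}(-g) = \sum_{h \in G} \widehat{-g}(h) \, f^{(e)}(h) = \sum_{h \in G} \hat g(-h) \, f^{(e)}(h) = \sum_{h \in G} \hat g(h) \, f^{(e)}(-h) = \sum_{h \in G} \hat g(h) \, f^{(e)}(h) = \check f^{(e)}(g) \qquad (3)$$

where the fourth equality is by time reversibility. By the definition of the
Fourier transform, $\check a(-g) = \overline{\check a(g)}$ for any real-valued
function $a$. Thus the fact that $\check f^{(e)}(g) = \check f^{(e)}(-g)$ is
equivalent to the fact that $\check f^{(e)}(g)$ is real.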

The formulas for the phylogenetic Fourier transform are simplified by re- 
rooting the tree at a leaf, which eliminates the need for a special root distribution 
[5, 14]. Specifically, we extend an edge from the root terminating in a new leaf; 
the previous root distribution is then replaced by a transition matrix along the 
new edge. Thus, without loss of generality, we assume our given tree on m
leaves is rooted at a leaf and that the root distribution puts all mass at the 
identity. 

Phylogenetic Fourier conjugation is an invertible transformation between
the collection of edge parameters $\psi^{(e)}(g)$ and the corresponding
site-pattern frequency vector for a given tree. This site-pattern frequency
vector is the joint distribution of states at the leaves defined as follows.
Start at the root, and move towards the leaves, changing state along an edge $e$
according to $P^{(e)}$. The induced joint distribution on the leaves will be
denoted $p$, where the component $p_g$ of $p$ is the probability of seeing
$g \in G^m$ by the above process.

The Fourier transform of the $p$ vector using the group $G^m$ will be denoted
$q$. The matrix representation of the Fourier transform for the group $G$ will
be denoted $K$, i.e. $K_{g,h} := \hat g(h)$ for any $g, h \in G$. The analogous
matrix for $G^m$ will be denoted $H$. Note that $H$ is the $m$-fold Kronecker
product of $K$. In this notation, $q = Hp$. We note that when $K$ (and thus $H$)
is a matrix with entries $\pm 1$, the Fourier transform is often called the
Hadamard transform.
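
As a small illustration of the Kronecker structure just described, the following sketch builds $H$ from $K$ with NumPy for the two-state case. The helper name `fourier_matrix` is hypothetical, and the index ordering is whatever `numpy.kron` produces (the first factor varies slowest), which is an assumption of this sketch rather than a convention fixed by the text.

```python
from functools import reduce

import numpy as np

def fourier_matrix(K, m):
    """m-fold Kronecker product of the single-site transform matrix K."""
    return reduce(np.kron, [K] * m)

K_cfn = np.array([[1, 1], [1, -1]])  # Z2 (CFN): K[g, h] = (-1)^(g*h)
K_k3p = np.kron(K_cfn, K_cfn)        # Z2 x Z2 (K3P), a 4 x 4 Hadamard matrix

m = 3
H = fourier_matrix(K_cfn, m)

# The uniform site-pattern vector has Fourier transform concentrated
# at the identity pattern.
p_uniform = np.full(2 ** m, 1.0 / 2 ** m)
q_uniform = H @ p_uniform
print(q_uniform)
```

Because the entries are $\pm 1$, `H @ H` equals $|G|^m$ times the identity, so the inverse transform is `H / 2**m` in this two-state case.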

Following [5], use $A(e)$ to denote the set of leaves $i$ such that the path
from $i$ to the root goes through $e$; $A(e)$ can be thought of as the set of
leaves "below" $e$. We also define

$$*g_e := \sum_{i \in A(e)} g_i.$$

The vector $*g$ is a natural lift of a $g \in G^m$ to an assignment of $G$ to
the edges of the tree. We will be using two versions of Fourier conjugation. In
this notation, version one can be written

Theorem 2.1 (Hendy, 1989 [18]; Evans and Speed, 1993 [19]).

$$q_g = \prod_e \check f^{(e)}(*g_e). \qquad (4)$$



The second version of the edge-parameter inequalities will use a different
version of the Fourier conjugation. In order to express this second version, we
state the following lemma:

Lemma 2.2.

$$\check f(h) = \exp(\check\psi(h)).$$

Proof. We begin as for Lemma 17.2 of [15] (though for right rather than left
eigenvalues),

$$(QK)_{g,h} = \sum_{x \in G} \psi(x - g) \, \hat x(h) = \sum_{y \in G} \psi(y) \, \widehat{y + g}(h) = \hat g(h) \sum_{y \in G} \psi(y) \, \hat y(h) = K_{g,h} \, \check\psi(h). \qquad (5)$$

Thus the $h$-th column of $K$ is a right eigenvector of $Q$ with eigenvalue
$\check\psi(h)$. The same argument with $f$ in place of $\psi$ shows that the
$h$-th column of $K$ is a right eigenvector of $P$ with eigenvalue
$\check f(h)$. However, $P = \exp(Q)$, so the eigenvalues of $P$ are the
exponentials of the corresponding eigenvalues of $Q$. □
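
Lemma 2.2 can be checked numerically for a single K3P edge. The sketch below is only an illustration under stated assumptions: the rates in `psi_vec` are arbitrary hypothetical values, the group elements are ordered A, C, G, T as (0,0), (1,0), (0,1), (1,1), and the matrix exponential is computed by eigendecomposition, which is valid here because $Q$ is symmetric.

```python
import numpy as np

ELEMENTS = [(0, 0), (1, 0), (0, 1), (1, 1)]  # Z2 x Z2, ordered A, C, G, T
INDEX = {g: i for i, g in enumerate(ELEMENTS)}

# K[g, h] = g-hat(h) = (-1)^(g . h)
K = np.array([[(-1) ** (g[0] * h[0] + g[1] * h[1]) for h in ELEMENTS]
              for g in ELEMENTS], dtype=float)

def rate_matrix(psi_vec):
    """Group-based rate matrix: Q[g, h] = psi(h - g)."""
    Q = np.empty((4, 4))
    for g in ELEMENTS:
        for h in ELEMENTS:
            d = ((h[0] - g[0]) % 2, (h[1] - g[1]) % 2)
            Q[INDEX[g], INDEX[h]] = psi_vec[INDEX[d]]
    return Q

# Hypothetical edge parameters psi(C), psi(G), psi(T); psi(A) makes the sum zero.
psi_vec = np.array([0.0, 0.10, 0.30, 0.05])
psi_vec[0] = -psi_vec[1:].sum()

Q = rate_matrix(psi_vec)
w, V = np.linalg.eigh(Q)          # Q is symmetric for Z2 x Z2
P = (V * np.exp(w)) @ V.T         # P = exp(Q)
f_vec = P[0]                      # f(h) = P[0, h], since row 0 is the identity

psi_check = K @ psi_vec           # Fourier transform of psi
f_check = K @ f_vec               # Fourier transform of f

print(np.allclose(Q @ K, K * psi_check))        # columns of K are eigenvectors
print(np.allclose(f_check, np.exp(psi_check)))  # Lemma 2.2
```

Both checks are expected to print `True` up to floating-point tolerance; the first is equation (5) in matrix form, the second is the lemma itself.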

As noted in the discussion after (3), $\check\psi(g)$ is real for any $g$. Thus
Lemma 2.2 implies

Observation 2.3. Any edge with real edge parameters will have real and
non-negative Fourier transform $\check f^{(e)}$.

Thus any tree with non-negative edge parameters has "biologically meaningful"
parameters in the language of [2], though the converse does not hold. We also
note that by (4) the $q_g$ are real; thus the logarithm in (7) retains its
usual meaning as a mapping between real numbers.

We will now present a second version of Fourier conjugation. By Lemma 2.2
and the definition of Fourier transform,

$$\psi(h) = [K^{-1} \log K f]_h \qquad (6)$$

where the subscript $h$ denotes the $h$ component of the vector. The following
theorem is Theorem 6 of [14] in the presence of (6).

Theorem 2.4 (Székely, Steel, and Erdős, 1993). Let $p(e,h)$ be the element of
$G^m$ which assigns $h$ to all leaves in $A(e)$ and 0 to all others. Then

$$\psi^{(e)}(h) = [H^{-1} \log q]_{p(e,h)}. \qquad (7)$$

Note that the log in equation (7) is entry-wise.
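
To make Theorem 2.4 concrete, here is a hedged sketch for the CFN quartet 12|34, assuming the tree is rooted at taxon 4 so that patterns range over $G^3$ (taxa 1 to 3 at indices 0 to 2). The edge labels `e1`..`e4` and `internal`, the table `BELOW` (playing the role of $A(e)$), and the function names are hypothetical. For CFN the single edge parameter $\psi^{(e)}(1)$ is the branch length.

```python
import itertools
import math

# A(e): indices of the non-root leaves below each edge (tree rooted at taxon 4).
BELOW = {
    "e1": {0},
    "e2": {1},
    "e3": {2},
    "e4": {0, 1, 2},       # pendant edge at the root leaf
    "internal": {0, 1},
}

def q_vector(gamma):
    """Theorem 2.1 for CFN: q_g is the product of exp(-2*gamma_e) over the
    edges e whose lifted state *g_e equals 1."""
    q = {}
    for g in itertools.product((0, 1), repeat=3):
        log_q = 0.0
        for e, below in BELOW.items():
            if sum(g[i] for i in below) % 2 == 1:
                log_q += -2.0 * gamma[e]
        q[g] = math.exp(log_q)
    return q

def recover_gamma(q):
    """Theorem 2.4: psi_e(1) = [H^{-1} log q] at p(e, 1), with H^{-1} = H / 8."""
    recovered = {}
    for e, below in BELOW.items():
        total = 0.0
        for g, q_g in q.items():
            sign = -1 if sum(g[i] for i in below) % 2 else 1
            total += sign * math.log(q_g)
        recovered[e] = total / 8.0
    return recovered

# The pathological example: pendant edges 0.7, internal edge -0.7.
gamma = {"e1": 0.7, "e2": 0.7, "e3": 0.7, "e4": 0.7, "internal": -0.7}
recovered = recover_gamma(q_vector(gamma))
print(recovered)
```

The recovered value for `internal` is negative, which is exactly the failure that the edge-parameter inequalities are designed to detect.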

3 Fourier transform inequalities: path version 

In this section we show first that one can very easily extract specific
$\check f^{(e)}(g)$ terms by taking ratios of certain $q_g$ terms. Then basic
inequalities for the $\check f^{(e)}(g)$ terms will lead to inequalities in the
$q_g$. Let $p(i,j)$ be the set of edges on the path between nodes $i$ and $j$ in
the tree ($i$ and $j$ may or may not be leaves). Now define

$$F(i,j;g) = \prod_{e \in p(i,j)} \check f^{(e)}(g).$$

We record the following facts for future use:

Lemma 3.1.

(i) Let $v$ be a node on the path from $i$ to $j$ in a tree. Then

$$F(i,v;g) \cdot F(v,j;g) = F(i,j;g).$$

(ii) $F(i,j;g) = F(j,i;g)$.

(iii) $F(i,j;g) = F(i,j;-g)$.

Proof. Parts (i) and (ii) are clear from the definition. Equation (3) implies
(iii). □

The following fact is a simple application of the above lemma and
Theorem 2.1.

Lemma 3.2. Let $i$ and $j$ be leaves and let $g$ have $g_i = h$, $g_j = -h$ and
all other components zero. Then $q_g = F(i,j;h)$.

The first identity is for pendant edges. Denote the set of leaves by
$\mathcal{L}$.

Proposition 3.3. Given some pendant edge $e$, let $i$ denote the leaf on $e$ and
let $v$ be the internal node on $e$. Pick $j$ and $k$ any leaves distinct from
$i$ such that the path $p(j,k)$ contains $v$. Let
$w(g_i,g_j,g_k) \in G^{\mathcal{L}}$ assign state $g_x$ to leaf $x$ for
$x \in \{i,j,k\}$ and the identity to all other leaves. Then

$$\frac{q_{w(h,-h,0)} \cdot q_{w(-h,0,h)}}{q_{w(0,-h,h)}} = \left[\check f^{(e)}(h)\right]^2. \qquad (8)$$

Proof. Lemmas 3.1 and 3.2 show

$$q_{w(h,-h,0)} = \check f^{(e)}(h) \cdot F(v,j;h)$$
$$q_{w(-h,0,h)} = \check f^{(e)}(h) \cdot F(v,k;h)$$
$$q_{w(0,-h,h)} = F(v,j;h) \cdot F(v,k;h).$$

□

A similar proof implies the next identity, which is for internal edges.

Proposition 3.4. Pick some internal edge $e$; say the two nodes on either side
of $e$ are $v$ and $v'$. Choose $i,j$ (resp. $i',j'$) such that $p(i,j)$ (resp.
$p(i',j')$) contains $v$ but not $v'$ (resp. $v'$ but not $v$). Let
$z(g_i,g_j,g_{i'},g_{j'}) \in G^{\mathcal{L}}$ assign state $g_x$ to leaf $x$
for $x \in \{i,j,i',j'\}$ and the identity to all other leaves. Then

$$\frac{q_{z(h,0,-h,0)} \cdot q_{z(0,-h,0,h)}}{q_{z(h,-h,0,0)} \cdot q_{z(0,0,-h,h)}} = \left[\check f^{(e)}(h)\right]^2. \qquad (9)$$


Now, constraints on the $\check f^{(e)}(h)$ will imply inequalities in the
$q_g$. Such nontrivial constraints exist; we review these constraints now for
the usual group-based models. First we investigate the two-state symmetric (CFN)
model, which was described in the introduction. There is only one nontrivial
component $\check f^{(e)}(1)$ of the Fourier transform along an edge, which is
$\exp(-2\gamma(e))$, where $\gamma(e)$ is the "branch length" of that edge. Now
$0 \le \gamma(e)$ is equivalent to

$$\check f^{(e)}(1) \le 1. \qquad (10)$$

Inserting the values for $\check f^{(e)}(1)$ from Propositions 3.3 and 3.4 into
this equation gives the edge-parameter inequalities for each edge. In summary,

Proposition 3.5. Assume that $q$ is the $\mathbb{Z}_2$ Fourier transform of a
site-pattern frequency vector under the CFN model. If $q$ satisfies a complete
set of phylogenetic invariants for a tree $\mathcal{T}$ and a set of
inequalities gained by substituting an instance of (8) or (9) into the square of
(10) for each edge $e$ of $\mathcal{T}$, then $q$ is the Fourier transform of
the expected site-pattern frequency vector of $\mathcal{T}$ for some assignment
of non-negative branch lengths to $\mathcal{T}$. Conversely, any tree with
non-negative branch lengths will satisfy such a set of inequalities.

As a quick application, we demonstrate how these inequalities exclude the
pathological example described in the introduction. For the internal edge of
this quartet tree under the CFN model, we should have

$$\frac{q_{1010} \, q_{0101}}{q_{1100} \, q_{0011}} = \left[\check f^{(e)}(1)\right]^2 \le 1.$$

However, by substituting in values from Table 1, the above ratio is
$\theta^{-2}$, which is greater than one.
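
The path ratios are easy to evaluate once the nonzero $q_g$ are available. The sketch below is illustrative only: it hard-codes the 12|34 quartet, takes per-edge Fourier entries $\exp(-2\gamma)$ in a dictionary with hypothetical keys `e1`..`e4` and `internal`, and uses hypothetical function names.

```python
def q_even(theta):
    """Nonzero q_g for the 12|34 quartet under CFN; theta maps each edge
    name to its nontrivial Fourier entry exp(-2 * gamma)."""
    t1, t2, t3, t4 = (theta[k] for k in ("e1", "e2", "e3", "e4"))
    ti = theta["internal"]
    return {
        "0000": 1.0,
        "1100": t1 * t2,
        "0011": t3 * t4,
        "1010": t1 * ti * t3,
        "0101": t2 * ti * t4,
        "1001": t1 * ti * t4,
        "0110": t2 * ti * t3,
        "1111": t1 * t2 * t3 * t4,
    }

def internal_ratio(q):
    """Ratio for the internal edge with (i, j, i', j') = (1, 2, 3, 4), h = 1."""
    return (q["1010"] * q["0101"]) / (q["1100"] * q["0011"])

def pendant_ratio_taxon1(q):
    """Ratio for the pendant edge at taxon 1 with (i, j, k) = (1, 2, 4), h = 1."""
    return (q["1100"] * q["1001"]) / q["0101"]

theta = 0.25
pathological = {"e1": theta, "e2": theta, "e3": theta, "e4": theta,
                "internal": 1.0 / theta}
honest = dict(pathological, internal=0.5)

print(internal_ratio(q_even(pathological)))  # theta ** -2, above one
print(internal_ratio(q_even(honest)))        # square of 0.5, at most one
```

Each ratio returns the square of the corresponding edge's Fourier entry, so comparing it with one tests the sign of that branch length without reference to the other edges.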

For the four-state models we will only discuss the Kimura three parameter
(K3P) model. It is the most general group-based four-state model; results for
this model extend to less general models by choosing transition matrices with
extra symmetries. The K3P model is associated with the group
$\mathbb{Z}_2 \times \mathbb{Z}_2$. Thus $K$ for this model is the Hadamard
matrix of order four, which is the Kronecker product of two Hadamard matrices of
order two. We make the identifications

$$A = (0,0) \quad C = (1,0) \quad G = (0,1) \quad T = (1,1). \qquad (11)$$

We write the column vector $\psi$ as

$$[-(\psi(C) + \psi(G) + \psi(T)), \ \psi(C), \ \psi(G), \ \psi(T)]^T.$$

Then by Lemma 2.2 we have that $\check f^{(e)}(A) = 1$ and

$$\check f^{(e)}(C) = \exp(-2(\psi(C) + \psi(T)))$$
$$\check f^{(e)}(G) = \exp(-2(\psi(G) + \psi(T))) \qquad (12)$$
$$\check f^{(e)}(T) = \exp(-2(\psi(C) + \psi(G))).$$

The following inequalities are equivalent to requiring $\psi(C)$, $\psi(G)$,
and $\psi(T)$ to be non-negative via (12):

$$\check f^{(e)}(C) \, \check f^{(e)}(T) \le \check f^{(e)}(G) \qquad (13)$$
$$\check f^{(e)}(G) \, \check f^{(e)}(T) \le \check f^{(e)}(C) \qquad (14)$$
$$\check f^{(e)}(C) \, \check f^{(e)}(G) \le \check f^{(e)}(T) \qquad (15)$$

In summary,

Proposition 3.6. Assume that $q$ is the $\mathbb{Z}_2 \times \mathbb{Z}_2$
Fourier transform of a site-pattern frequency vector under the K3P model. If $q$
satisfies a complete set of phylogenetic invariants for a tree $\mathcal{T}$ and
a set of inequalities gained by substituting an instance of (8) or (9) into the
square of (13), (14), and (15) for each edge $e$ of $\mathcal{T}$, then $q$ is
the Fourier transform of the expected site-pattern frequency vector of
$\mathcal{T}$ for some assignment of non-negative edge parameters to
$\mathcal{T}$. Conversely, any tree with non-negative edge parameters will
satisfy such a set of inequalities. □
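
The link between the sign of the K3P edge parameters and the three inequalities (13) to (15) can be illustrated for a single edge. This is a sketch under assumptions: the function names and the numerical rates are hypothetical, and a small tolerance is used because the comparison is done in floating point.

```python
import math

def fourier_entries(psi_c, psi_g, psi_t):
    """Nontrivial Fourier entries of one K3P edge, following equation (12)."""
    return {
        "C": math.exp(-2.0 * (psi_c + psi_t)),
        "G": math.exp(-2.0 * (psi_g + psi_t)),
        "T": math.exp(-2.0 * (psi_c + psi_g)),
    }

def satisfies_k3p_inequalities(f, tol=1e-12):
    """Check (13), (14) and (15) for one edge."""
    return (f["C"] * f["T"] <= f["G"] + tol
            and f["G"] * f["T"] <= f["C"] + tol
            and f["C"] * f["G"] <= f["T"] + tol)

valid_edge = fourier_entries(0.10, 0.30, 0.05)
# psi(G) is negative here, yet every Fourier entry still lies in (0, 1).
invalid_edge = fourier_entries(0.10, -0.05, 0.30)

print(satisfies_k3p_inequalities(valid_edge))
print(satisfies_k3p_inequalities(invalid_edge))
print(all(0.0 < v < 1.0 for v in invalid_edge.values()))
```

The second edge shows why positivity of the Fourier entries, or even bounding them by one, is not enough on its own: a negative rate can hide behind entries that all look admissible, and only the monomial inequalities expose it.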

For example, say we substitute (9) into the square of (13). This gives

$$\frac{q_{z(C,0,C,0)} \cdot q_{z(0,C,0,C)}}{q_{z(C,C,0,0)} \cdot q_{z(0,0,C,C)}} \cdot \frac{q_{z(T,0,T,0)} \cdot q_{z(0,T,0,T)}}{q_{z(T,T,0,0)} \cdot q_{z(0,0,T,T)}} \le \frac{q_{z(G,0,G,0)} \cdot q_{z(0,G,0,G)}}{q_{z(G,G,0,0)} \cdot q_{z(0,0,G,G)}}$$

which is equivalent to a monomial inequality of degree six.

Before moving on, we highlight that (8) is essentially concerned with induced
subtrees on only 3 taxa, and (9) is concerned with induced subtrees on only 4
taxa. Inequalities on the collection of these subtrees imply positivity of edge
parameters for the entire tree.

4 Fourier transform inequalities: canonical version

The previous section described a relatively simple set of inequalities which can be 
computed for any edge of a tree. However, some readers may feel uncomfortable 
with the fact that these inequalities involve some arbitrary choice. In this section 
we give a "canonical" version of the edge parameter inequalities which is a simple 
consequence of Theorem 2.4. This version of the inequalities also gives a clearer
understanding of the underlying geometry. 

We now specialize to the case of either the CFN model or the K3P model 
(this again includes K3P with extra symmetries, such as JC DNA and K2P). In 
these cases, the entries of the Fourier transform matrix K are ±1. 

Proposition 4.1. Let $G = \mathbb{Z}_2$ or $\mathbb{Z}_2 \times \mathbb{Z}_2$
and $p(e,h)$ be the element of $G^m$ which assigns $h$ to all leaves in $A(e)$
and 0 to all others. Then for any $q$ generated on a tree with non-negative edge
parameters,

$$\prod_{g \,:\, \widehat{p(e,h)}(g) = 1} q_g \ \ge \prod_{g \,:\, \widehat{p(e,h)}(g) = -1} q_g. \qquad (16)$$

Conversely, any tree (with edge parameters) whose $q$ satisfies (16) for every
$e$ and $h$ has non-negative edge parameters.

Proof. Recall that $H^{-1} = |G|^{-m} H$. Thus (7) is

$$|G|^{m} \, \psi^{(e)}(h) = [H \log q]_{p(e,h)}, \qquad (17)$$

the left hand side of which is non-negative by our main assumption. Exponentiate
(17); the left hand side will be not less than one, and the right hand side
becomes a ratio with those $q_g$ such that $\widehat{p(e,h)}(g) = 1$ on top and
those $q_g$ such that $\widehat{p(e,h)}(g) = -1$ on the bottom. Then multiply to
clear denominators. □






Although we have specialized to groups where $K$ has real entries, we note
here that equivalent (though more complex) such inequalities exist in all cases.
First, we claim that $q_h = q_{-h}$ for any $h$. Indeed, assuming time
reversibility we have $\check f^{(e)}(g) = \check f^{(e)}(-g)$, thus
$q_h = q_{-h}$ by (4). It follows that the coefficients of the $q_h$ in
$H^{-1} q$ are real. Therefore the same exponentiation process as in
Proposition 4.1 works, although the $q_h$ may now have exponents different from
$\pm 1$.

The "path" inequalities of Propositions 3.3 and 3.4 and the "canonical"
inequalities of Proposition 4.1 are equivalent. Indeed, they each express the
inequality $\psi^{(e)}(h) \ge 0$ for various $e$ and $h$. The expressions are
different, but by the definition of invariants one can go from one formulation
to the other via a complete set of phylogenetic invariants [5].

The previous paragraph establishes equivalence between the two formulations in
principle; we present an example here to show how the transformation works.
Assume a quartet tree of topology 12|34; use notation as in the introduction.
First we investigate the pendant edge leading to taxon 1. By (16), that edge
having non-negative edge length is equivalent to

$$q_{0000} \, q_{0110} \, q_{0011} \, q_{0101} \ge q_{1100} \, q_{1010} \, q_{1001} \, q_{1111}. \qquad (18)$$

A couple of algebraic steps using the phylogenetic invariant
$q_{1100} \, q_{0011} = q_{1111}$ and the fact that $q_{0000} = 1$ show that
(18) is equivalent to

$$1 \ge \left( \frac{q_{1100} \, q_{1001}}{q_{0101}} \right) \left( \frac{q_{1100} \, q_{1010}}{q_{0110}} \right)$$

which is the product of the two "path" pendant edge length inequalities.
Similarly, the internal edge being non-negative is equivalent to

$$1 \ge \frac{q_{1010} \, q_{0101} \, q_{1001} \, q_{0110}}{q_{0000} \, q_{1111} \, q_{1100} \, q_{0011}} = \left( \frac{q_{1010} \, q_{0101}}{q_{1100} \, q_{0011}} \right) \left( \frac{q_{1001} \, q_{0110}}{q_{1100} \, q_{0011}} \right)$$

where the right hand side of the equality is the product of the two "path"
internal edge length inequalities.

The canonical construction generalizes the inequalities to the more general
setting of group-based mutation models on split networks as formulated by
David Bryant [15]. Assume the set of splits is labeled $\Sigma$. In his elegant
formulation, one assigns mutation probabilities to each possible split, i.e. a
probability distribution on the group $G$ for each split. Assuming independence
of these distributions, one gets a probability distribution on $G^{\Sigma}$ by
multiplication. From there the probability of a single site-pattern $h$ (i.e.
the assignment of a group element to each taxon) is the sum of the probabilities
of all elements of $G^{\Sigma}$ which give $h$ on the leaves.

Fourier conjugation also works in this setting. Although Bryant's paper [15]
only develops the conjugation in the case of models with a fixed rate matrix and
"branch length" varying among splits, there is also an invertible transformation
for the setting where one allows the whole rate matrix to vary. We will apply
this extended version and call the set of $\psi^{(e)}$ for splits $e$ "split
parameters," analogous to the edge parameters we have been describing so far.
Although we do not go into details here, the proof of the Fourier conjugation
formula in the extended case is similar to that in [15]. One can then obtain an
equation for the Fourier conjugation written exactly as in (7) but with a
generalized definition of the terms: "root" the splits network at the taxon $n$,
and so redefine $A(e)$ to be all of the taxa on the opposite "side" of the split
from $n$. For example, $A(12|34)$ is the set $\{1, 2\}$, as in this case $n = 4$.

Definition 4.2. Let $Y_{G,m}$ be the points of q-space which satisfy
inequalities (16) for each split $e$ and each $h \in G$.

Observation 4.3.

(i) $Y_{G,m}$ is the image of the non-negative split parameter splits networks
under Hadamard conjugation.

(ii) $Y_{G,m}$ is contractible.

(iii) The points of q-space corresponding to trees of topology $\mathcal{T}$
with non-negative edge parameters are the zero set of the phylogenetic
invariants for $\mathcal{T}$ intersected with $Y_{G,m}$. These points sit on the
boundary of $Y_{G,m}$ for $m > 3$.

Proof. We note that $Y_{G,m}$ is the (injective) image of the set of non-negative
split parameter vectors in a non-negative orthant. For (i), the inequalities (16) precisely
specify positivity of split parameters. For (ii) the required homotopy simply 
uniformly shrinks every split parameter to zero. The first sentence of (iii) is 
equivalent to Proposition 4.1. For the second sentence, the boundary of $Y_{G,m}$
consists of the image of splits networks with at least one zero split parameter.
Phylogenetic trees are simply splits networks such that only a compatible set of
split parameters are nonzero. □ 
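
The homotopy used for (ii) can be written out explicitly. The following is only a sketch: the symbol $\Phi$ for the (injective) conjugation map from non-negative split parameter vectors to $q$-space and the name $H_t$ are introduced here for illustration and are not notation from the text.

```latex
H_t(q) \;=\; \Phi\bigl((1-t)\,\Phi^{-1}(q)\bigr),
\qquad t \in [0,1],\quad q \in Y_{G,m}.
% H_0 is the identity on Y_{G,m}, and H_1 is the constant map to \Phi(0).
% If \psi has non-negative entries then so does (1-t)\psi,
% hence H_t(q) remains in Y_{G,m} for every t.
```

Continuity of $H_t$ relies on $\Phi$ being a homeomorphism onto its image, which is assumed in this sketch.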

This series of observations suggests that rather than a phylogenetic "orange"
[4] with one orange slice for each tree topology, one might think of a phylogenetic
"soccer ball" with one panel of the soccer ball for each tree topology. Indeed, the
set of Fourier transformed points corresponding to any tree lies on the boundary
of a higher dimensional contractible object. However, it should be noted that
not every point of the boundary of $Y_{G,m}$ corresponds to a phylogenetic tree,
and in fact the panels are of strictly lower dimension than the boundary of the
soccer ball.

Furthermore, we now show that the soccer ball $Y_{G,m}$ is not convex. Recall
that $\hat f^{(e)}(g)$ is real by the discussion after (3). Then:

Lemma 4.4. The components of the Fourier transformed mutation probability
vector $\hat f^{(e)}(g)$ are less than or equal to one for any edge $e$ with non-negative
edge parameters.

Proof. By Lemma 2.2 it suffices to show that $\hat\psi^{(e)}(g)$ is non-positive. By the
definition of $\psi$,

$$\psi(0) = -\sum_{g \neq 0} \psi(g),$$

which implies that $\hat\psi^{(e)}(g)$ is non-positive by the definition of the discrete Fourier
transform. □
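
The last step of this proof can be spelled out in one line. The sketch below assumes the convention $\hat\psi^{(e)}(g) = \sum_{h \in G} \chi_g(h)\,\psi^{(e)}(h)$ and is written for the case where the characters $\chi_g$ take only the values $\pm 1$, as they do for $\mathbb{Z}_2$ and $\mathbb{Z}_2 \times \mathbb{Z}_2$; the symbol $\chi_g$ is introduced here for illustration.

```latex
\hat\psi^{(e)}(g)
  = \psi^{(e)}(0) + \sum_{h \neq 0} \chi_g(h)\,\psi^{(e)}(h)
  = \sum_{h \neq 0} \bigl(\chi_g(h) - 1\bigr)\,\psi^{(e)}(h)
  \;\le\; 0,
% using \psi^{(e)}(0) = -\sum_{h \neq 0} \psi^{(e)}(h),
% \chi_g(h) \le 1, and \psi^{(e)}(h) \ge 0 for h \neq 0.
```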

Proposition 4.5. $Y_{G,m}$ is not convex for $m > 3$ and $G = \mathbb{Z}_2$ or $\mathbb{Z}_2 \times \mathbb{Z}_2$.

Proof. We report the argument for the case of $G = \mathbb{Z}_2 \times \mathbb{Z}_2$ (i.e. K3P); the case
of $G = \mathbb{Z}_2$ is analogous but easier. We label the states A, C, G, T as in (11). Pick
an arbitrary tree $\mathcal{T}$ on $m$ taxa; find a cherry (two-taxon rooted subtree) of $\mathcal{T}$
and label its leaves with 1, 2. Number the edge leading to taxon 1 with 1,
the edge leading to taxon 2 with 2, and the edge meeting 1 and 2 with 3. Pick
arbitrary $0 < \theta_1, \theta_2, \theta_3 < 1$ such that

$$\theta_1 \theta_2 < \theta_3^2 \, ((\theta_1 + \theta_2)/2)^2; \tag{19}$$

this is easily achieved by fixing $\theta_2$ and $\theta_3$ and taking $\theta_1$ to be small.

We will construct two vectors $q', q'' \in Y_{G,m}$ such that $q := (q' + q'')/2$ is
not in $Y_{G,m}$. The vectors $q'$ and $q''$ will be defined via the Fourier transform
by specifying their $\hat f^{(e)}(g)$. It can be checked that $q'$ and $q''$ sit in $Y_{G,m}$ using
Lemma 2.2, then taking the logarithm and the inverse Fourier transform.

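Let $V = \{C, T\}$. For $q'$ set

$$\hat f^{(1)}(g) = \theta_1, \qquad \hat f^{(2)}(g) = \theta_2, \qquad \hat f^{(3)}(g) = \theta_3$$

for $g \in V$, and $\hat f^{(e)}(g) = 1$ otherwise. For $q''$ set

$$\hat f^{(1)}(g) = \theta_2, \qquad \hat f^{(2)}(g) = \theta_1, \qquad \hat f^{(3)}(g) = \theta_3$$

for $g \in V$, and $\hat f^{(e)}(g) = 1$ otherwise.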

We claim that $q$ violates (16) with $e = 3$ and $h = C$, and thus does not sit
in $Y_{G,m}$. To establish this claim, we calculate each side of (16). First note that
$C(g) = -1$ for $g \in V$ and is 1 otherwise. Thus $\rho(3, C)(g) = -1$ exactly when
$|\{g_1, g_2\} \cap V|$ is odd, and is 1 otherwise (here and below the notation $g_i$ denotes
the $i$th-taxon component of $g$).

Define $q_u(x_1, x_2)$ to be $q_g$ for any $g$ such that $g_1 = x_1$ and $g_2 = x_2$. This
$q_u(x_1, x_2)$ is well defined via (9) because all $\hat f^{(e)}(g) = 1$ except when $e = 1, 2, 3$.
Noting that $C + C = 0$, we see that $q_u(C, C) = \theta_1 \theta_2$ by (4). Similarly,

$$q_u(C, A) = q_u(A, C) = \theta_3 (\theta_1 + \theta_2)/2.$$

Because we have arranged that $\hat f^{(e)}(A) = \hat f^{(e)}(G) = 1$ and $\hat f^{(e)}(C) = \hat f^{(e)}(T)$
for both $q'$ and $q''$, there are three cases for $q_u(x_1, x_2)$. If $x_1$ and $x_2$ are both in $V$ then
$q_u(x_1, x_2) = q_u(C, C)$; if $|\{x_1, x_2\} \cap V|$ is one then $q_u(x_1, x_2) = q_u(C, A)$. If neither $x_1$
nor $x_2$ is in $V$ then $q_u(x_1, x_2) = 1$.

Thus (16) is in this case




Taking both sides to the power of $4^{1-m}$ and substituting gives

$$\theta_1 \theta_2 \geq \theta_3^2 \, ((\theta_1 + \theta_2)/2)^2,$$

violating (19). □
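
The construction in this proof can be checked numerically. The following Python sketch is an illustration under stated assumptions, not the paper's own computation: it encodes $\mathbb{Z}_2 \times \mathbb{Z}_2$ as bit pairs (an assumed labelling chosen so that the character of C is $-1$ exactly on $V = \{C, T\}$), restricts attention to the two cherry coordinates since every other edge has $\hat f^{(e)} \equiv 1$, and assumes that the inequality for $e = 3$, $h = C$ amounts to $\sum_g \rho(3, C)(g) \log q_g \geq 0$. All function names are hypothetical.

```python
import itertools
import math

# Z2 x Z2 encoded as bit pairs. This labelling of A, C, G, T is an assumption
# chosen so that the character of C equals -1 exactly on V = {C, T}.
A, C, G, T = (0, 0), (0, 1), (1, 0), (1, 1)
GROUP = [A, C, G, T]
V = {C, T}

def add(x, y):
    """Group addition in Z2 x Z2 (componentwise XOR)."""
    return (x[0] ^ y[0], x[1] ^ y[1])

def chi_C(g):
    """Character of C: -1 on V, +1 elsewhere."""
    return -1 if g in V else 1

def cherry_q(t1, t2, t3):
    """q_u(x1, x2) when edge i has f-hat equal to t_i on V and 1 elsewhere,
    and every other edge has f-hat identically 1."""
    def fhat(t, g):
        return t if g in V else 1.0
    return {
        (x1, x2): fhat(t1, x1) * fhat(t2, x2) * fhat(t3, add(x1, x2))
        for x1, x2 in itertools.product(GROUP, repeat=2)
    }

def log_monomial(q):
    """Sum of rho(3, C) * log q_u(x1, x2) with rho(3, C) = chi_C(x1 + x2).
    Under the assumption stated above, a negative value means the
    inequality for e = 3, h = C is violated."""
    return sum(chi_C(add(x1, x2)) * math.log(v) for (x1, x2), v in q.items())

theta1, theta2, theta3 = 0.01, 0.9, 0.9
# Condition (19) holds for these illustrative values.
assert theta1 * theta2 < theta3**2 * ((theta1 + theta2) / 2) ** 2

q_prime = cherry_q(theta1, theta2, theta3)
q_double_prime = cherry_q(theta2, theta1, theta3)
q_mix = {k: (q_prime[k] + q_double_prime[k]) / 2 for k in q_prime}

print(log_monomial(q_prime), log_monomial(q_double_prime), log_monomial(q_mix))
```

With these values the two tree points give a positive sum while their midpoint gives a negative one, which is the non-convexity asserted by the proposition. The particular values of $\theta_1, \theta_2, \theta_3$ are arbitrary choices satisfying (19).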

Proposition 4.5 has an interesting phylogenetic interpretation along the lines
of [20]: there are mixtures of two site-pattern frequency vectors corresponding to
trees such that the splits network corresponding to the mixture has negative edge
parameters. However, the trees used in the proof had many edge parameters
equal to zero; this is not strictly necessary, though it greatly simplifies the proof.



5 Consequences and Conclusions 

In summary, we have presented a collection of inequalities in the Fourier trans- 
formed site-pattern frequency space which are equivalent to the assumption 
that group-based mutation rate matrices have non-negative off-diagonal entries. 
We are motivated in part by the idea of formulating maximum likelihood as 
a constrained optimization problem [5,9]. We noted in the introduction that 
the previously known constraints are not sufficient to ensure that the result of 
the constrained optimization is in fact a proper tree. As described in
Propositions 3.5, 3.6, and 4.1, our inequalities complete the set of constraints: if a $q$
satisfies a complete set of phylogenetic invariants and the inequalities described
here, then it does indeed correspond to a bona fide tree. Thus phylogenetic
invariants along with the edge-parameter inequalities could indeed be safely used
to formulate maximum-likelihood phylogenetic estimation as a constrained
optimization problem.

We also defined $Y_{G,m}$, which is the set of $q$ which come from splits networks
with non-negative edge parameters. We noted that the tree images for each
tree topology sit on the boundary of $Y_{G,m}$. Here we showed that $Y_{G,m}$ is not
convex at a number of points. Note that because $Y_{G,m}$ is cut out by monomial
inequalities (16), one would expect that $Y_{G,m}$ would be non-convex at "most"
points.

As the edge-parameter inequalities are the second component of the
constraints for phylogenetic trees, one might wonder if they could be used for
phylogenetic inference in a manner analogous to phylogenetic invariants [3,10]. In
a sense these inequalities appear more natural than phylogenetic invariants for 
the purpose of determining the tree corresponding to a data set: given a real- 
world data set, one might actually hope that the inequalities presented here 
could be satisfied, whereas phylogenetic invariants (which are equalities) will 
essentially never be. Using the terminology above, one might hope that data 
would sit in the interior of $Y_{G,m}$ even though one would never expect data to
sit on its boundary.

This hope is not justified for simulated data on a tree. Indeed, one can 
think of the simulated data points as some distribution centered on the expected 
distribution. Recall that the set of trees is simply the set of splits networks
with some edge parameters set to zero. If the simulation distribution does
not have support on some lower-dimensional surface, then the pre-image of
the distribution will almost certainly have points with negative coordinates in
parameter space. Said another way, it is improbable that a sample from a
distribution centered on a "corner" of the boundary of $Y_{G,m}$ would sit in the
interior of $Y_{G,m}$. As an example one might look at Figure 17.1 of [7] where
negative split parameters (besides that for the trivial split) are encountered in 
a simulation. Despite these challenges, edge-parameter inequalities may well 
prove useful for inference. 
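
The effect described in this paragraph can be reproduced with a small simulation. The Python sketch below is illustrative only and makes several assumptions not stated above: it uses the two-state symmetric model ($G = \mathbb{Z}_2$) on four taxa, the quartet 12|34 with all five edge parameters set to 0.1, and the standard Hadamard conjugation $s = H^{-1}\exp(H\gamma)$ with splits encoded as 3-bit masks over taxa 1, 2, 3 (taxon 4 is the reference). The names `pattern_probs`, `split_params`, and `fraction_with_negative` are hypothetical.

```python
import math
import random

N_SPLITS = 8  # subsets of {1, 2, 3} as 3-bit masks; taxon 4 is the reference

def hadamard(i, j):
    """Entry (-1)^{|i & j|} of the 8 x 8 Sylvester Hadamard matrix."""
    return -1 if bin(i & j).count("1") % 2 else 1

def transform(v):
    """Multiply a length-8 vector by H."""
    return [sum(hadamard(i, j) * v[j] for j in range(N_SPLITS))
            for i in range(N_SPLITS)]

def pattern_probs(gamma):
    """Site-pattern probabilities s = H^{-1} exp(H gamma)."""
    rho = [math.exp(r) for r in transform(gamma)]
    return [x / N_SPLITS for x in transform(rho)]

def split_params(s):
    """Recovered split parameters gamma = H^{-1} log(H s)."""
    r = [math.log(x) for x in transform(s)]
    return [x / N_SPLITS for x in transform(r)]

# Quartet 12|34: four pendant edges and the internal edge {1, 2}.
gamma = [0.0] * N_SPLITS
for mask in (0b001, 0b010, 0b100, 0b111, 0b011):
    gamma[mask] = 0.1
gamma[0] = -sum(gamma[1:])  # entry for the trivial (empty) split

s = pattern_probs(gamma)

def fraction_with_negative(n_sites=500, n_reps=200, seed=0):
    """Fraction of simulated data sets in which at least one of the two
    splits incompatible with the tree (13|24 or 23|14) receives a negative
    estimated parameter."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        draws = rng.choices(range(N_SPLITS), weights=s, k=n_sites)
        s_hat = [draws.count(p) / n_sites for p in range(N_SPLITS)]
        g_hat = split_params(s_hat)
        if min(g_hat[0b101], g_hat[0b110]) < 0:
            hits += 1
    return hits / n_reps

print(fraction_with_negative())
```

On the true tree the two incompatible splits have parameter exactly zero, so sampling noise pushes their estimates to either side of zero; the printed fraction indicates how often a simulated data set lands outside the non-negative region. The sample size and number of replicates are arbitrary choices.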

We acknowledge that all of the work presented here is for group-based mod- 
els. This is a rather strong restriction as all group-based models must have 
uniform stationary distribution; real data sets rarely have this feature.
Presumably, there are inequalities corresponding to those presented here for
non-group-based models. However, as no Fourier transform is available for those
models, the formulation may be very complex.